Towards a Decentralized Algorithm for Mapping Network and Computational 
Resources for Distributed Data-Flow Computations 

Shah Asaduzzaman and Muthucumaru Maheswaran 
Advanced Networking Research Lab 
School of Computer Science 
McGill University 
Montreal, QC H3A 2A7, Canada 
{asad, maheswar}@cs.mcgill.ca



Abstract 

Several high-throughput distributed data-processing ap- 
plications require multi-hop processing of streams of data. 
These applications include continual processing on data 
streams originating from a network of sensors, composing 
a multimedia stream through embedding several compo- 
nent streams originating from different locations, etc. These 
data-flow computing applications require multiple process- 
ing nodes interconnected according to the data-flow topol- 
ogy of the application, for on-stream processing of the data. 
Since the applications usually run for a long period, it
is important to optimally map the component computations 
and communications on the nodes and links in the network, 
fulfilling the capacity constraints and optimizing some qual- 
ity metric such as end-to-end latency. The mapping problem 
is unfortunately NP-complete and heuristics have been pre- 
viously proposed to compute the approximate solution in a 
centralized way. However, because of the dynamicity of the 
network, it is practically impossible to aggregate the correct 
state of the whole network in a single node. In this paper, 
we present a distributed algorithm for optimal mapping of 
the components of the data flow applications. We propose 
several heuristics to minimize the message complexity of the 
algorithm while maintaining the quality of the solution. 

1. Introduction 

Real-time processing of continuous data streams is becoming
an important component of data-flow intensive distributed
applications. In general, these applications consist of a few
cascades of computational operations on several streams of
data that originate from one or more sources, and they present
a view of the processed data at one or more sink nodes.
Applications such as continual queries [4] on the stream of
information sent by a network of sensors, composition of a
multimedia stream through several stages of encoding, decoding
and embedding [3, 9], scientific workflows [6], etc. belong to
this category. These applications require several computational
resources along the path the data streams travel from the
source to the destination. In addition, each of these
computations generates new data streams that are to be
processed by other computations or delivered to the
destination. Sufficient network link bandwidth must therefore
be provided to carry these data streams among the source,
destination and computational nodes, so that the computations
can proceed seamlessly. In this paper, we deal with the problem
of optimally allocating computational and network resources
for these distributed applications.

Usually the distributed computation operates for a long
time after being set up with all the necessary resources. So,
it is important to acquire the resources optimally before the
operation starts. When resources are requested for a
distributed job, the topology that interconnects the component
nodes of the flow, i.e. the data sources, the processing nodes
and the destination, is known. In very general terms, the
interconnection topology can be an acyclic graph. However,
in most common cases the flow is a linear path, a tree or
a series-parallel graph. We show in Section 2.3 that even
for a linear path-like flow, finding a mapping that places
computations on processing nodes and data transmissions on
network paths, while satisfying the processing capacity and
bandwidth constraints, is an NP-complete problem. In this
paper, we develop a scheme to solve the problem of mapping a
linear path-like computation on an arbitrary resource network.

The problem of establishing a path between a source and
a destination node in an arbitrary network, subject to some
end-to-end quality constraints, has been a topic of active
research for a long time. If such a path is to be established
to satisfy one additive quality requirement such as delay or
hop-count, the problem can easily be solved by Dijkstra's
shortest path algorithm. Even if some end-to-end min-max
constraint such as bandwidth needs to be satisfied, the
problem can still be solved easily using Wang and Crowcroft's
shortest-widest path algorithm [10]. However, it is well
known that establishing a path that satisfies more than one
additive quality constraint is an NP-hard problem [1, 8]. It
is important to note that the problem of finding a mapping
for a data-flow computation requires more than end-to-end
constraints, because the computational capacity of each of the
nodes needs to be individually satisfied.
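To make the single-constraint cases above concrete, the following is a minimal sketch of a shortest-widest path computation: it first finds the largest achievable bottleneck bandwidth, then runs Dijkstra's algorithm over the links that offer at least that bandwidth. The adjacency format and the function names are assumptions made for illustration, and the two-pass formulation is one simple way to obtain such a path, not necessarily the procedure of [10].

```python
import heapq

def widest_bandwidth(adj, src, dst):
    """Largest bottleneck bandwidth over all src->dst paths (0 if unreachable).

    adj: {node: [(neighbor, bandwidth, delay), ...]}  (illustrative format)
    """
    best = {src: float("inf")}
    heap = [(-float("inf"), src)]
    while heap:
        neg_w, u = heapq.heappop(heap)
        w = -neg_w
        if w < best.get(u, 0):
            continue  # stale heap entry
        if u == dst:
            return w
        for v, bw, _delay in adj.get(u, []):
            cand = min(w, bw)
            if cand > best.get(v, 0):
                best[v] = cand
                heapq.heappush(heap, (-cand, v))
    return 0

def shortest_widest_path(adj, src, dst):
    """Among the widest src->dst paths, return (width, delay, path) of a shortest one."""
    width = widest_bandwidth(adj, src, dst)
    if width == 0:
        return None
    dist, prev, heap = {src: 0}, {}, [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue
        if u == dst:
            break
        for v, bw, delay in adj.get(u, []):
            if bw < width:
                continue  # link too narrow for the widest path
            if d + delay < dist.get(v, float("inf")):
                dist[v] = d + delay
                prev[v] = u
                heapq.heappush(heap, (d + delay, v))
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return width, dist[dst], path[::-1]
```

Neither pass can express the per-node capacity requirements of a data-flow computation, which is why the mapping problem needs a different treatment.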

Due to the inherent complexity of the optimization prob- 
lem, several workable heuristic solutions have been pro- 
posed in different contexts. A recursive mapping on a hi- 
erarchy of node-groups in the resource networks is applied 
in [4]. In [9] and [3], mapping is performed after prun- 
ing the whole resource network into a subset of compatible 
resources. The solution by Liang and Nahrstedt [5] is clos- 
est to ours. One of the assumptions made by Liang and 
Nahrstedt was that the optimization algorithm was executed 
in a single node and complete state of the resource network 
is available to that node before execution. In a large scale 
dynamic network this assumption is hard to realize. If we 
assume that each node in the resource network is aware of 
the state of its immediate neighborhood only, we need to 
compute the solution using a distributed algorithm. In this 
paper we present a distributed algorithm to solve the prob- 
lem, which is a dynamic programming based extension of 
the distributed Bellman-Ford algorithm. 

The rest of the paper is organized as follows. In Sec- 
tion 2 of this paper we formally define the resource allo- 
cation problem as a constrained graph mapping problem. 
The Bandwidth Constrained Path Mapping (BCPM) problem,
which covers most of the practical applications, is then
defined as a special case of the general graph mapping
problem. We provide a formal proof of NP-completeness of the
BCPM problem in the same section. In Section 3, central- 
ized and decentralized algorithms to solve the BCPM prob- 
lem are developed. A guideline for designing cost-effective 
heuristics to obtain approximate solutions to the problem is 
provided at the end of the same section. The discussion is 
then summarized with directions for possible future exten- 
sions in Section 4. 

2. Problem Formulation 

In this section we formally define the problem of capac- 
ity constrained mapping of dataflow computations on arbi- 
trary networks. Any distributed dataflow computation can 
be defined using three types of nodes and interconnection 
between them. Source nodes are the data sources originat- 
ing the data streams. Computing nodes are places where 
some computational operation on one or more incoming 
data-stream is performed continually, and an output stream 
is generated. Sink nodes are the places where the resulting
flow from the computation is presented. In a very general
case, a dataflow computation consists of one or more source
nodes, one or more sink nodes and zero or more computing
nodes. The topology of data-flow among these nodes is a
directed acyclic graph (DAG). Although it is theoretically
possible to have dataflow computations with loops or cycles,
the data makes a finite number of iterations through the
cycles, and these iterations can be expanded into finite
acyclic graphs. In most common cases, however, the dataflow
topology is a simple path consisting of a series of computing
nodes, or a tree where data-streams from multiple sources are
merged through several steps and presented at a single sink.

The network of computing and data-forwarding re- 
sources where the distributed dataflow computation is to be 
instantiated can be represented by an arbitrary graph. We
call this graph the resource graph. Each node of the resource
graph has a certain computational capacity and each edge
(link) of the resource graph has a certain data transmission
capacity or bandwidth. In addition, each link may have one or
more additive quality metrics, such as latency, jitter, etc.

2.1. Capacity Constrained Graph Mapping Problem

In order to launch the distributed application on the net- 
work of computers, we need to map the dataflow-DAG onto 
the resource graph such that the computational and trans- 
mission requirements are fulfilled. If there is more than one 
such feasible mapping, one would like to choose the map- 
ping that has minimum end-to-end delay on the resource 
network. 

More formally, we need to map a dataflow-DAG
$G_J = (V_J, E_J)$ onto a resource graph $G_R = (V_R, E_R)$.
For each vertex $v_R \in V_R$, an available computational
capacity $C_{av}(v_R)$ is given. For each edge $e_R \in E_R$,
an available bandwidth $B_{av}(e_R)$ is given. In addition,
each edge $e_R \in E_R$ has an additive weight. For each vertex
$v_J \in V_J$, a computational requirement $C_{req}(v_J)$, and
for each edge $e_J \in E_J$, a bandwidth requirement
$B_{req}(e_J)$ is defined. There is a set of designated source
nodes $S_J = \{s_{1J}, s_{2J}, \ldots, s_{mJ}\} \subset V_J$
and a set of sink nodes
$T_J = \{t_{1J}, t_{2J}, \ldots, t_{nJ}\} \subset V_J$, such
that $S_J \cap T_J = \emptyset$.

The bandwidth constrained DAG-mapping problem
(BCDM) is to find a mapping $M : V_J \to V_R$. For
each source node $s_{iJ}$, $M(s_{iJ}) = s_{iR}$ and for each
sink node $t_{iJ}$, $M(t_{iJ}) = t_{iR}$ are already given. It
is important to note that multiple nodes of the dataflow-DAG
can map onto a single node of the resource graph and a single
edge in the dataflow-DAG can span a multi-hop path
in the resource graph. So, defining the $V_J \to V_R$ mapping
is not sufficient to define the mapping of the complete
dataflow-DAG. In addition to the vertex mapping, another
mapping $M_e : E_J \to P_R$ is needed, where $P_R$ is the set
of all possible paths in the resource graph, including zero
length paths. Zero length paths are $(v, v)$ edges with
infinite bandwidth and zero latency. Again, it is possible
that for two different edges $e_1, e_2 \in E_J$, the mapped
paths $p_1 = M_e(e_1)$ and $p_2 = M_e(e_2)$ have some common
edges.

The mapping should fulfill the following constraints -

$$\forall v_R \in M(V_J):\quad \sum_{\{v_J \mid v_J \in V_J,\ M(v_J) = v_R\}} C_{req}(v_J) \le C_{av}(v_R)$$

$$\forall e_J \in E_J:\quad B_{req}(e_J) \le \min[B_{av}(e_R),\ e_R \in M_e(e_J)]$$

We call this problem the Bandwidth Constrained DAG Mapping
problem (BCDM).
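The two constraints can be checked mechanically for a candidate mapping. The sketch below is only an illustration: the dictionary-based encoding, the function name and the numbers in the usage note are assumptions, not part of the formulation. A mapped path is given as a list of resource nodes, and a single-node list stands for a zero length path. Following the constraints as stated, each dataflow edge is checked against every link on its path independently, so bandwidth shared by several mapped paths is not summed.

```python
from collections import defaultdict

def is_feasible_mapping(c_req, b_req, c_av, b_av, node_map, edge_map):
    """Check the capacity and bandwidth constraints of a DAG mapping.

    c_req:    {dataflow_node: capacity requirement}
    b_req:    {(dataflow_u, dataflow_v): bandwidth requirement}
    c_av:     {resource_node: available capacity}
    b_av:     {(resource_u, resource_v): available bandwidth}
    node_map: M,   {dataflow_node: resource_node}
    edge_map: M_e, {(dataflow_u, dataflow_v): [resource nodes along the path]}
    """
    # Capacity: total requirement placed on each resource node must fit.
    load = defaultdict(int)
    for vj, vr in node_map.items():
        load[vr] += c_req[vj]
    if any(total > c_av[vr] for vr, total in load.items()):
        return False
    # Bandwidth: every link on a mapped path must carry the edge's stream.
    for (uj, vj), path in edge_map.items():
        if path[0] != node_map[uj] or path[-1] != node_map[vj]:
            return False  # path does not connect the mapped endpoints
        for hop in zip(path, path[1:]):
            if hop not in b_av or b_req[(uj, vj)] > b_av[hop]:
                return False
    return True
```

For example, with hypothetical capacities `{"A": 5, "B": 5, "C": 10}`, placing a computing node and the sink together on `C` is accepted as long as their combined requirement does not exceed 10, and the edge between them is mapped on the zero length path `["C"]`.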

When each edge $e_R \in E_R$ in the resource graph has an
additive metric $D(e_R)$, such as delay, cost, jitter, etc., we
would like to find the feasible mapping that minimizes the
total cost, i.e. the sum of $D(e_R)$ over the resource edges
of all mapped paths.




Figure 1. An example resource network

Figure 2. An example data-flow computation with a DAG topology

Figure 1 shows an example resource network of eight
interconnected computing nodes. The computational capacity of
each node is represented by a number inside the node. The
link bandwidth and latency are shown on each edge.
Figure 2 shows a dataflow-DAG containing two source nodes
$s_1$ and $s_2$, two computing nodes $x_1$ and $x_2$, and one
sink node $t$. $s_1$, $s_2$, and $t$ must be mapped on resource
nodes A, B, and F, respectively. Each node in the dataflow-DAG
has a processing capacity requirement, which is shown inside
the node. Each link is also annotated with a bandwidth
requirement. A feasible mapping of this dataflow-DAG on the
resource graph is -



2.2. Constrained Path Mapping Problem 

Although in very general terms the dataflow computa- 
tion resembles a DAG topology, in most practical cases the 
topology is a simple path. Given that mapping a DAG
efficiently on the resource network with all the constraints
satisfied is hard, it is useful to tackle the simpler
bandwidth constrained path mapping problem (BCPM) first. In
BCPM, the topology of the data flow computation is restricted
to a directed loop-free path, with a single source and a
single sink.

Precisely, we are given a dataflow path $P_J = (V_J, E_J)$,
with $V_J = \{v_0 = s, v_1, v_2, \ldots, v_m = t\}$ and
$E_J = \{e_i = (v_i, v_{i+1}) \mid 0 \le i < m\}$, to map onto
the resource graph $G_R = (V_R, E_R)$ defined in the previous
section. Each node $v_i$, $0 \le i \le m$, of the program path
has a computational capacity requirement $C_{req}(v_i)$, and
each edge $e_i = (v_i, v_{i+1})$, $0 \le i < m$, has a
bandwidth requirement $B_{req}(e_i)$. We need to find the
mappings $M : V_J \to V_R$ and $M_e : E_J \to P_R$ that satisfy
the constraints. The mapping of $s$ and $t$ is already given.

An example dataflow path with one source $s$, one sink $t$
and three computational nodes $x_1$, $x_2$, $x_3$ is shown in
Figure 3, with the node capacity and bandwidth requirements.
$s$ and $t$ must be mapped on B and F, respectively. There
can be many feasible mappings of this dataflow computation
on the resource graph in Figure 1. One of them is -




Figure 3. An example data-flow computation with a path topology

which is also optimal in terms of total end-to-end latency
between the resource nodes $M(s)$ and $M(t)$.

2.3. Computational Complexity of the Problem

We will now prove that the BCPM problem is NP-complete.
Since BCPM is a special case of BCDM, NP-completeness
of BCPM implies that BCDM is an NP-hard problem. The
NP-completeness proof of the BCPM problem is constructed
by transformation from the Longest Path problem [2].
The decision version of the Longest Path problem is defined
as follows -

Instance: A graph $G = (V, E)$, a length function
$l : E \to Z^+$, specified vertices $s, t \in V$ and a positive
integer $K$. Question: Is there an $(s \leadsto t)$ simple path
$P \subseteq G$ such that $\sum_{e \in P} l(e) \ge K$?

It is known that the Longest Path problem is NP-complete,
even for the special case where $\forall e \in E,\ l(e) = 1$
[2]. We will show that any instance of this special Longest
Path problem can be polynomially transformed into an instance
of BCPM.

2.3.1. Longest Path $\propto$ BCPM

We construct an instance of BCPM as follows - 

We take $G_R(V_R, E_R) = G(V, E)$,
$\forall v \in V_R,\ C_{av}(v) = 1$ and
$\forall e \in E_R,\ B_{av}(e) = 1$. Take a simple path
$P_J = (V_J, E_J)$ such that $|V_J| = K$,
$\forall v \in V_J,\ C_{req}(v) = 1$ and
$\forall e \in E_J,\ B_{req}(e) = 1$.

Now, if there is a simple $(s \leadsto t)$ path of length
$\ge K$ in $G$, then that path must have at least $K$ hops,
since $\forall e \in E,\ l(e) = 1$. Therefore, we can map
$P_J$ along the corresponding path $P_R$ in $G_R$. If
$|P_R| > K$, then we can map the first $K - 1$ nodes of $P_J$
on $P_R$ and map the remaining edge $(u_{K-1}, u_K)$ on the
$(v \leadsto t)$ subpath of $P_R$, where $u_{K-1}$ is mapped
on $v$.



Conversely, given a mapping of the path $P_J$ on a path
$P_R \subseteq G_R$ that satisfies the capacity and bandwidth
requirement constraints, $|P_R|$ must be at least $K$, because
no two vertices of $P_J$ can be mapped on a single vertex of
$P_R$ given the abovementioned capacity constraints.
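The construction used in this transformation is small enough to write down directly. The following sketch builds the BCPM instance from a unit-length Longest Path instance; the function name and the dictionary layout are illustrative assumptions, and the dataflow path is given $K$ nodes, as in the text.

```python
def longest_path_to_bcpm(vertices, edges, s, t, K):
    """Build a BCPM instance from a unit-length Longest Path instance.

    vertices, edges: the graph G = (V, E); s, t: end vertices; K: target length.
    Returns (resource, dataflow) as plain dictionaries (illustrative layout).
    """
    # Resource graph: same topology, unit capacity and unit bandwidth everywhere.
    resource = {
        "c_av": {v: 1 for v in vertices},
        "b_av": {e: 1 for e in edges},
    }
    # Dataflow path: K nodes with unit requirements, ends pinned to s and t.
    dataflow = {
        "c_req": {i: 1 for i in range(K)},
        "b_req": {(i, i + 1): 1 for i in range(K - 1)},
        "source": (0, s),
        "sink": (K - 1, t),
    }
    return resource, dataflow
```

Because every resource node has capacity 1 and every dataflow node requires 1, no two dataflow nodes can share a resource node, which is the property the argument above relies on. The construction clearly takes time linear in $|V| + |E| + K$.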

2.3.2. BCPM $\in$ NP

Given an arbitrary mapping $M : V_J \to V_R$, one can verify
in polynomial time -

• Whether $C_{req}(v) \le C_{av}(M(v))$, for all $v \in V_J$.

• For each edge $(u, v) \in P_J$, whether there is a
$(M(u) \leadsto M(v))$ path in $G_R$ that satisfies the
bandwidth constraint of $(u, v)$ (similar to the bandwidth
constrained shortest path problem [10]).

This completes the proof that BCPM $\in$ NP-C.
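The two checks can be sketched as a polynomial-time verifier. In this illustration the dataflow path is encoded by position (an assumption: `c_req[i]` is the requirement of the i-th node, `b_req[i]` that of the edge between nodes i and i+1, and `node_map[i]` is the resource node chosen for node i). The capacity check sums the requirements placed on the same resource node, as in the constraint of Section 2.1, and the bandwidth check is a breadth-first search over the links that are wide enough.

```python
from collections import deque

def has_path_with_bandwidth(b_av, src, dst, need):
    """True if some src->dst path uses only links with bandwidth >= need."""
    if src == dst:
        return True  # zero length path
    adj = {}
    for (u, v), bw in b_av.items():
        if bw >= need:
            adj.setdefault(u, []).append(v)
    seen, queue = {src}, deque([src])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, []):
            if v == dst:
                return True
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return False

def verify_bcpm_certificate(c_req, b_req, c_av, b_av, node_map):
    """Check a proposed vertex mapping in time polynomial in the instance size."""
    load = {}
    for i, need in enumerate(c_req):
        load[node_map[i]] = load.get(node_map[i], 0) + need
    if any(total > c_av[v] for v, total in load.items()):
        return False
    return all(
        has_path_with_bandwidth(b_av, node_map[i], node_map[i + 1], need)
        for i, need in enumerate(b_req)
    )
```

Each search costs $O(|V_R| + |E_R|)$ and there are $|P_J| - 1$ of them, so the whole check is polynomial.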

3. Algorithm for path mapping problem 

To solve the BCPM problem, we developed an algorithm
using the Bellman-Ford relaxation scheme. First,
we present the centralized version of the algorithm, where 
the whole mapping is computed by a single node that has 
knowledge of the state of the whole network of nodes. 
Later, we explain the development of the distributed algo- 
rithm based on this centralized one. 

This algorithm works by relaxing along each edge of the
resource graph $N - 1$ times, where $N = |V_R|$ is the number
of nodes in the resource graph. For each node $u$ of the
resource graph, a set of feasible mappings of different length
prefixes of the dataflow-path on any resource path from the
source node $s$ to the current node is maintained. In each
relaxation along a $(u, v)$ edge, any new feasible map on
$(s \leadsto u)$ is extended in all possible ways, to complete
the list of feasible maps of dataflow path-prefixes on the
resource path $(s \leadsto u, v)$, and these new partial
mappings are added to the set maintained for node $v$. After
$N - 1$ iterations of relaxation of all edges, the map set
maintained for the terminal node $t$ contains all the feasible
mappings of the dataflow-path on any $(s \leadsto t)$ resource
path. The algorithm is presented in Algorithms 1, 2 and 3. A
formal proof of the correctness of the algorithm is presented
in the following sub-section. Lines 10-12 of the subroutine
Relax are added to terminate the algorithm as soon as one
feasible $(s \leadsto t)$ mapping is found. These lines should
be omitted when the optimal mapping is sought.
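The relaxation scheme can be illustrated with a compact executable sketch. It is not a transcription of Algorithms 1 to 3: the data layout, the function names and the choice to carry the visited route with each partial map (to keep resource paths simple) are assumptions made for this example, and it assumes a source different from the sink and a dataflow path with at least two nodes. Each round extends every partial map of the previous round over one more link and then discards the old maps, so the loop runs $N - 1$ rounds and keeps the lowest-latency complete mapping that reaches the sink.

```python
def pathmap(c_req, b_req, c_av, links, src, dst):
    """Exhaustive path mapping by N - 1 rounds of edge relaxation.

    c_req[i]: capacity requirement of dataflow node i (node 0 sits on src,
              the last node on dst); b_req[i]: bandwidth of edge (i, i+1)
    c_av:     {resource_node: capacity}
    links:    {(u, v): (bandwidth, latency)}, directed
    Returns (total_latency, assignment tuple) or None if infeasible.
    """
    p = len(c_req)
    out = {}
    for (u, v), (bw, lat) in links.items():
        out.setdefault(u, []).append((v, bw, lat))

    def place(assign, node, lo, hi):
        # Yield `assign` extended by k further dataflow nodes on `node`,
        # for lo <= k <= hi, for as long as the node capacity allows.
        j, load = len(assign), 0
        for k in range(hi + 1):
            if k > 0:
                load += c_req[j + k - 1]
                if load > c_av[node]:
                    return
            if k >= lo:
                yield assign + (node,) * k

    # Initial maps: a non-empty prefix on src (the last node is pinned to dst).
    frontier = [(0, a, (src,)) for a in place((), src, 1, p - 1)]
    best = None
    for _ in range(len(c_av) - 1):              # N - 1 relaxation rounds
        nxt = []
        for lat, assign, route in frontier:
            j = len(assign)
            for v, bw, d in out.get(route[-1], []):
                if v in route or bw < b_req[j - 1]:
                    continue                    # keep the route simple; the stream must fit
                if v == dst:                    # all remaining nodes must land on dst
                    for full in place(assign, v, p - j, p - j):
                        if best is None or lat + d < best[0]:
                            best = (lat + d, full)
                else:                           # host 0 or more nodes, never the last one
                    for a in place(assign, v, 0, p - j - 1):
                        nxt.append((lat + d, a, route + (v,)))
        frontier = nxt                          # old partial maps are dropped
    return best
```

The list `frontier` plays the role of the "new" mappings of the last iteration. Its size can grow exponentially with the network size, which is the behaviour the heuristics of Section 3.4 are meant to contain.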

We have computed the computational complexity of the
algorithm in Section 3.2. The complexity is bounded by a
polynomial in the size $S$ of the partial map set, although
the set size itself is exponential. Since the problem is
NP-hard, a polynomially bounded optimal algorithm cannot exist
unless P = NP. However, heuristics may be applied to produce
sub-optimal solutions within a tractable amount of complexity.
A good way of designing such heuristics is to restrict the
size of the map-set in some way. In Section 3.4 we discuss
several possible heuristics to solve the BCPM problem. Note
that because the set of partial maps is stored in each node,
the memory complexity of the algorithm becomes exponential
too. This can be avoided by omitting the storage of partial
maps. Each partial map needs to be stored for one iteration of
relaxation only. If partial maps are deleted after relaxation,
the set size never grows beyond $O(dp)$, where $d$ is the
average indegree of a node in the resource graph and
$p = |P_J|$ is the number of nodes in the dataflow path.

Algorithm 1 Pathmap($P_J$, $G_R$)
 1: for $x = 0$ to $|P_J| - 1$ do
 2:   if $\sum_{0 \le k \le x} C_{req}(k) \le C_{av}(s)$ then
 3:     $M(s, x) = \{m \mid m$ maps initial $x$ nodes of $P_J$ on $s\}$
 4:   else
 5:     break
 6:   end if
 7: end for
 8: for each vertex $v \in V_R - s$ do
 9:   for $i = 0$ to $|P_J|$ do
10:     $M(v, i) = \emptyset$
11:   end for
12: end for
13: for $i = 1$ to $|V_R| - 1$ do
14:   for each edge $e = (u, v) \in E_R$ do
15:     Relax($u$, $v$)
16:   end for
17: end for

Algorithm 2 subroutine Relax($u$, $v$)
 1: for $j = 0$ to $|P_J|$ do
 2:   $M_{tmp}(j)$ = null
 3: end for
 4: for $j = 0$ to $|P_J| - 1$ do
 5:   if $B_{req}(j, j + 1) \le B_{av}(u, v)$ then
 6:     for each new mapping $m \in M(u, j)$ in the last iteration do
 7:       if $v == t$ then
 8:         $m_x$ = Extend($m$, $j$, $|P_J| - j$, $v$)
 9:         $M(v, |P_J|) = M(v, |P_J|) \cup m_x$
10:         if $M(v, |P_J|) \ne \emptyset$ then
11:           terminate the algorithm with $M(v, |P_J|)$ as result
12:         end if
13:       else
14:         for $x = 0$ to $|P_J| - j - 1$ do
15:           $m_x$ = Extend($m$, $j$, $x$, $v$)
16:           if $m_x \ne$ null then
17:             $M(v, j + x) = M(v, j + x) \cup m_x$
18:           else
19:             break
20:           end if
21:         end for
22:       end if
23:       mark $m$ as old
24:     end for
25:   end if
26: end for

3.1. Correctness of BCPM algorithm

In this section we give a formal proof that when the BCPM
algorithm terminates, $M(t, |P_J|)$ contains a feasible
mapping of $P_J$ on $G_R$ if and only if such a feasible
mapping exists.

Lemma 3.1. If $M(u) = \bigcup_{\forall j} M(u, j)$ contains
all feasible mappings of different length prefixes of $P_J$ on
a path $(s \leadsto u) \in G_R$, then after computing
Relax($u$, $v$), $M(v)$ includes all feasible mappings of
different length prefixes of $P_J$ on the path
$(s \leadsto u, v) \in G_R$.

Proof. By the construction of the Relax($u$, $v$) subroutine,
each mapping $m \in M(u, j)$ of a $j$-length prefix of $P_J$
on an $(s \leadsto u)$ path is extended over the $(u, v)$ edge
exactly once. Any possible mapping of a $k$-length prefix of
$P_J$ on the $(s \leadsto u, v)$ path can be divided into two
sub-mappings: a mapping of a $j$-length prefix ($j \le k$) of
$P_J$ on the $(s \leadsto u)$ path and a mapping of the
following $k - j$ vertices of the $k$-length prefix on $v$.
Since all feasible sub-mappings of the first kind are included
in $M(u)$ and all the extensions of the second kind are
considered in lines 7 to 12 and 13 to 22 of Relax($u$, $v$),
$M(v)$ contains all feasible mappings of any prefix of $P_J$
on $(s \leadsto u, v)$ paths. □

Lemma 3.2. For any node $v \in V_R$, if there is an
$(s \leadsto v)$ path $(v_0 = s, v_1, v_2, \ldots, v_k = v)$
of length $k$, then after the $k$th iteration of the outer for
loop in line 13 of the Pathmap algorithm, all feasible
mappings of different length prefixes of $P_J$ on the
$(v_0 \leadsto v_k)$ path have been recorded in $M(v)$.

Proof. We will prove by induction on $k$. When $k = 0$, i.e.
after the initialization phase, $M(v_0, i)$ or $M(s, i)$,
$0 \le i < |P_J|$, contains the feasible $i$-length prefix
with the first $i$ vertices of $P_J$ mapped on $s$. So the
basis is true.

Now let us assume that after $i - 1$ iterations,
$0 < i \le k$, $M(v_{i-1})$ contains all feasible mappings of
different lengths on the $(s \leadsto v_{i-1})$ portion of the
$(s \leadsto v_k)$ path. Since each edge in $E_R$ is
considered once in each iteration, Relax($v_{i-1}$, $v_i$)
must be called in the $i$th iteration too. So, by Lemma 3.1,
we can conclude that all feasible prefix mappings of $P_J$ on
the $(s \leadsto v_i)$ path are included in $M(v_i)$. □

Theorem 3.3. After $|V_R| - 1$ iterations of the outer loop in
line 13 of algorithm Pathmap, for each node $v \in V_R$,
$M(v)$ contains all feasible mappings of different length
prefixes of $P_J$ on all possible $(s \leadsto v)$ paths.

Proof. Since there is no simple path longer than
$|V_R| - 1$, according to Lemma 3.2, all such paths will be
covered by the Relax procedure after $|V_R| - 1$ iterations. □

The fact that after termination of Pathmap, $M(t)$ contains
all the feasible maps of $P_J$ on possible $(s \leadsto t)$
paths follows directly from Theorem 3.3 with the inclusion of
lines 7 to 12 in the Relax procedure.

Algorithm 3 subroutine Extend($m$, $j$, $x$, $v$)
1: if $\sum_{1 \le k \le x} C_{req}(j + k) \le C_{av}(v)$ then
2:   extend $m$ by putting computations $\{j + 1, j + 2, \ldots, j + x\}$ in node $v$
3:   let $m_x$ be the extended mapping
4: else
5:   $m_x$ = null
6: end if
7: return $m_x$

3.2. Complexity of the algorithm 

The problem size parameters are $|V_R| = n$, $|E_R| = e$
and $|P_J| = p$. The outer loop of Pathmap is iterated $n - 1$
times and each iteration considers each of the $e$ edges
exactly once. So, the Relax procedure is called $(n - 1)e$,
i.e. $O(ne)$, times. In each relaxation over an edge $(u, v)$,
each of the $p$ prefix mappings from $M(u)$ is tried for
relaxation into some of the $p$ mappings in $M(v)$. A
$j$-length prefix in $M(u, j)$ is tried for relaxation into
$p - j$ of the $M(v, i)$, $j \le i \le p$, and each trial
requires $(i - j)$ computations of constant complexity for the
extension. Let $S$ be the maximum number of entries in the set
of mappings $M(u, j)$, $u \in V_R$, $0 \le j \le p$. Note that
only the new entries are relaxed in each iteration. However,
the upper bound on the number of entries relaxed per
$M(u, j)$ will be $S$. So, the complexity of Relax($u$, $v$) is

$$\sum_{j=0}^{p} S \sum_{i=j}^{p} (i - j) = O(p^3 S)$$

So, the overall time complexity of the algorithm becomes
$O(nep^3 S)$. We see that the sets $M(u, j)$ create the
major load on both the time and memory complexity of the
algorithm. Therefore, restricting the growth of $S$ within a
polynomial limit would possibly result in a polynomial time
approximation algorithm.
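The dependence on $p$ can be checked by evaluating the double sum that the counting argument describes, where each $j$-length prefix is tried against the targets $i \ge j$ at a cost of $i - j$ per trial. The derivation below is a worked check under that reading of the argument, using the substitution $r = p - j$:

```latex
\sum_{j=0}^{p} \sum_{i=j}^{p} (i - j)
  = \sum_{j=0}^{p} \frac{(p - j)(p - j + 1)}{2}
  = \sum_{r=0}^{p} \binom{r + 1}{2}
  = \binom{p + 2}{3}
  = \frac{p(p + 1)(p + 2)}{6}
  = O(p^3)
```

For $p = 2$, for instance, the inner sums are 3, 1 and 0, which add up to 4, in agreement with $2 \cdot 3 \cdot 4 / 6 = 4$. Multiplying by the at most $S$ entries per set gives the cubic-in-$p$, linear-in-$S$ cost of one relaxation.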



3.3. Distributed version of the algorithm 

The centralized algorithm can be easily extended to a
distributed version, where each node $u$ in the resource
network $G_R$ maintains the data structure $M(u)$ of partially
computed mappings. Node $u$ is also responsible for
computing the relaxation to each of its neighbors $v$ in
$G_R$. The extended mappings are then transmitted to $v$. The
relaxation procedure is invoked by a node $u$ when any new
mapping arrives from any of its incoming neighbors. The
algorithm is formally laid out in Algorithm 4. Upon arrival
of a map message $m$, a node $u$ processes the message
using the algorithm ProcessMap($u$, $m$). It follows from the
correctness of the centralized algorithm that the distributed
mapping completes after at most $N - 1$ ProcessMap
invocations by each node in the graph. The distributed mapping
algorithm can be terminated by force as soon as the terminal
node receives a complete mapping. Otherwise, the algorithm
terminates after all the outstanding ProcessMap invocations
have been completed. Since cycles are avoided during
extension, an initial mapping may be extended at most $N - 1$
times. Thus there will be a finite number of ProcessMap
invocations and the algorithm will terminate after a finite
amount of time.



Algorithm 4 ProcessMap($u$, $m$)
 1: Map message contains the mapping of computation nodes $0, 1, 2, \ldots, j$ on resource nodes. The first message to a node contains the requirement definition of the computation too
 2: $j = |m|$
 3: if $u == t$ then
 4:   $m_x$ = Extend($m$, $j$, $|P_J| - j$, $u$)
 5:   if $m_x \ne$ null then
 6:     terminate the algorithm with $m_x$ as result
 7:   end if
 8: else
 9:   for $x = 0$ to $|P_J| - j - 1$ do
10:     $m_x$ = Extend($m$, $j$, $x$, $u$)
11:     if $m_x \ne$ null then
12:       for each neighbor $v$ of $u$ that is not already in $m$ do
13:         if $B_{req}(j + x, j + x + 1) \le B_{av}(u, v)$ then
14:           extend $m_x$ to $m_{xx}$ by appending a map of computations on node $v$
15:           send $m_{xx}$ to $v$
16:         end if
17:       end for
18:     end if
19:   end for
20: end if
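The message-driven behaviour can be imitated on a single machine with a FIFO queue standing in for the network. The sketch below is a simulation under stated assumptions, not the distributed implementation: messages are processed in arrival order, the route is carried in each message to avoid cycles, the local placement is done by the receiving node, and the run stops at the first complete mapping that reaches the sink. All names and the data layout are hypothetical.

```python
from collections import deque

def simulate_process_map(c_req, b_req, c_av, links, src, dst):
    """Simulate message passing; return (first complete mapping or None, messages sent).

    c_req[i]: requirement of dataflow node i; b_req[i]: bandwidth of edge (i, i+1)
    c_av: {resource_node: capacity}; links: {(u, v): bandwidth}, directed
    """
    p = len(c_req)
    out = {}
    for (u, v), bw in links.items():
        out.setdefault(u, []).append((v, bw))

    def fits(node, j, x):
        # Can dataflow nodes j .. j+x-1 all be hosted on `node`?
        return sum(c_req[j:j + x]) <= c_av[node]

    inbox = deque([(src, (), ())])  # (receiving node, mapping so far, route so far)
    sent = 0
    while inbox:
        u, m, route = inbox.popleft()
        j = len(m)
        if u == dst:
            if fits(u, j, p - j):   # the sink must host everything that is left
                return m + (u,) * (p - j), sent
            continue
        lo = 1 if u == src else 0   # the source hosts at least dataflow node 0
        for x in range(lo, p - j):  # the last dataflow node is reserved for dst
            if not fits(u, j, x):
                break
            mx = m + (u,) * x
            for v, bw in out.get(u, []):
                if v != u and v not in route and bw >= b_req[j + x - 1]:
                    inbox.append((v, mx, route + (u,)))
                    sent += 1
    return None, sent
```

With FIFO delivery the first complete mapping is the one with the fewest hops, which is not necessarily the cheapest one. The message counter gives a feel for the messaging load that the heuristics in the next section try to reduce.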



3.4. Heuristic Approaches to Reduce Complexity

Computational complexity of both the centralized and
the distributed path mapping algorithms grows exponentially
with the problem size. Therefore, for practical deployment,
we need a heuristic that produces a good approximation
to the optimal result. Here we discuss three possible
heuristics that modify the original algorithm to reduce
computational, messaging and memory complexity.

3.4.1. LeastCostMap

One major source of growth in complexity of the algorithm
is the exponential growth of the set of partial maps
maintained for each node. In the LeastCostMap heuristic, only
one partial map of each prefix-length is maintained for each
node. If a new map is generated, the cost of the new map
in terms of the additive quality metric is compared with that
of the already stored one, and the map with the higher cost is
discarded. With $S = 1$, this policy reduces the complexity of
each relaxation to $O(p^3)$.
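The pruning rule amounts to keeping a single entry per (node, prefix length) pair. A minimal sketch follows; the dictionary-based store and the name `offer` are hypothetical, and ties are resolved in favour of the map that is already stored.

```python
def offer(store, node, prefix_len, cost, mapping):
    """Keep at most one, least-cost partial map per (node, prefix length).

    store: {(node, prefix_len): (cost, mapping)}
    Returns True if the offered map was stored (and should be extended further),
    False if it was discarded in favour of a cheaper or equal stored map.
    """
    key = (node, prefix_len)
    kept = store.get(key)
    if kept is not None and kept[0] <= cost:
        return False
    store[key] = (cost, mapping)
    return True
```

Only maps for which `offer` returns True would be relaxed in the next iteration, so each set holds one entry instead of an exponentially growing collection.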

A similar policy can be applied to the distributed version
of the algorithm. However, in the distributed case, a map
message is expanded to the neighbors as soon as the message
is received. So, if a higher cost map message arrives
before a lower cost one, the processing of the higher cost
message cannot be pruned. However, in most cases, higher
cost messages arrive later, so they are pruned.

We have implemented both the centralized and dis- 
tributed version of the original algorithm and also the Least- 
CostMap heuristic. The algorithms are then applied on ran- 
dom topologies generated by the BRITE Internet topology 
generator [7] and randomly generated dataflow paths. Due 
to the huge computational complexity of the exact algo- 
rithm, it was not possible to run it for networks larger than 
50 nodes. For these networks, the heuristic is able to find the
optimal solution in 99% of the cases, with a 100 to 1000 fold
reduction in the size of the set of partial maps. For similar
topologies, the distributed version of the heuristic produced
the optimal result in more than 99% of the cases, and the total
number of messages exchanged was reduced approximately 100 fold.

3.4.2. AnnealedLeastCostMap

One way of trading off between optimality and complexity
of the LeastCostMap heuristic is to apply a simulated
annealing approach to decide whether to discard a higher cost
partial map from the set in the presence of a lower cost map.
As the temperature of the process anneals, i.e. at the later
iterations, the probability of keeping a non-minimal partial
solution decreases. This approach certainly increases
the computation and message complexity. However, it allows
some of the non-minimal partial solutions to grow and
possibly lead to a better complete solution.
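The text does not fix an acceptance rule or a cooling schedule, so the following sketch has to assume both: a Metropolis-style probability $e^{-\Delta / T}$ for keeping a map that costs $\Delta$ more than the best known one, and a geometric schedule $T = T_0 \alpha^{k}$ at iteration $k$. The function names and default parameters are illustrative only.

```python
import math
import random

def keep_probability(extra_cost, temperature):
    """Assumed Metropolis-style probability of keeping a costlier partial map."""
    if extra_cost <= 0:
        return 1.0          # not worse than the stored map: always keep
    if temperature <= 0:
        return 0.0          # fully cooled: behave like plain LeastCostMap
    return math.exp(-extra_cost / temperature)

def keep_costlier_map(extra_cost, iteration, t0=10.0, alpha=0.5, rng=random):
    """Decide whether to keep a non-minimal map at a given relaxation iteration."""
    temperature = t0 * (alpha ** iteration)   # assumed geometric cooling
    return rng.random() < keep_probability(extra_cost, temperature)
```

With these assumptions, a map that is 5 units costlier is kept with probability about 0.61 at temperature 10 and about 0.007 at temperature 1, which matches the intended behaviour of becoming stricter in later iterations.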



3.4.3. RandomNeighbor 

Another way of restricting the message complexity is to
extend any partial map to a randomly chosen subset of $k$
neighbors instead of expanding to all of them. Higher values
of $k$ increase the chance of getting the optimal solution.
The RandomNeighbor heuristic with $k = 1$ did not produce
results as good as LeastCostMap, although the number of
messages was reduced dramatically. Further investigation needs
to be done to determine a suitable value of $k$.
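The neighbor selection step itself is simple to sketch. In this illustration the function name and the `visited` argument (nodes already on the route of the partial map) are assumptions; the selection is uniform without replacement.

```python
import random

def choose_neighbors(neighbors, visited, k, rng=random):
    """Pick up to k not-yet-visited neighbors uniformly at random."""
    candidates = [v for v in neighbors if v not in visited]
    return rng.sample(candidates, min(k, len(candidates)))
```

A node would forward an extended partial map only to the returned neighbors. With $k$ at least the node degree this degenerates to the full expansion of the original algorithm.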

4. Conclusion 

In this paper we have developed and explained a
decentralized algorithm to compute the optimal mapping of the
computational capacity and network bandwidth requirements of
a data-flow computation. Many high-throughput scientific
research platforms need to support applications that resemble
data-flow computations. The discussion presented in this
paper provides an in-depth understanding of the resource
allocation problem for such computations and demonstrates a
way to develop cost-effective solutions. At this point, the
algorithm supports computations with path topology only.
Several interesting applications, such as complex continual
queries on data streams originating from multiple sites,
resemble a tree topology. A possible extension of this
work is to modify the algorithm such that mappings of
flow-computations with different topologies can be obtained.